The Best Templates Match Technique For Example Based Machine Translation

Authors

  • Tarek El-Shishtawy
  • A. El-Sammak
Abstract

It has been shown that large-scale, realistic Knowledge Based Machine Translation (KBMT) applications require the acquisition of a huge amount of knowledge about language and about the world. This knowledge is encoded in computational grammars, lexicons and domain models. Another approach, which avoids the need to collect and analyze such massive knowledge, is the Example Based approach, which is the topic of this paper. We show in the paper that the Example Based approach in its basic form is not suitable for translating into Arabic. Therefore, a modification of the basic approach is presented to improve the accuracy of the translation process. The basic idea of the new approach is to improve the technique by which template-based approaches select the appropriate templates. It relies on extracting, from a parallel bilingual corpus, all possible templates that could match parts of the source sentence. These templates are selected as suitable candidate chunks for the source sentence. The corresponding Arabic templates are also extracted and represented by a directed graph. Each branch represents one candidate string of templates for the target sentence. The shortest continuous path, or the most probable tree branch, is selected to represent the target sentence. Finally, the Arabic translation of the selected tree branch is generated.
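The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the template table `TEMPLATES`, the placeholder target strings of the form `[AR:...]`, and the function names are all hypothetical, and "shortest continuous path" is read here as "the gap-free path that uses the fewest templates". The probability-based variant and any target-side reordering are not modeled.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy template table: source token sequence -> target template.
# The "[AR:...]" strings are placeholders for Arabic templates that a real
# system would extract from a parallel bilingual corpus.
TEMPLATES: Dict[Tuple[str, ...], str] = {
    ("the",): "[AR:the]",
    ("boy",): "[AR:boy]",
    ("the", "boy"): "[AR:the boy]",
    ("reads",): "[AR:reads]",
    ("a", "book"): "[AR:a book]",
    ("reads", "a", "book"): "[AR:reads a book]",
}

Edge = Tuple[int, int, str]  # (start position, end position, target template)


def candidate_edges(tokens: List[str]) -> List[Edge]:
    """Collect every template that matches a contiguous part of the sentence."""
    edges: List[Edge] = []
    for start in range(len(tokens)):
        for end in range(start + 1, len(tokens) + 1):
            target = TEMPLATES.get(tuple(tokens[start:end]))
            if target is not None:
                edges.append((start, end, target))
    return edges


def shortest_template_path(tokens: List[str]) -> Optional[List[str]]:
    """Return the target templates on the gap-free path with the fewest edges.

    Token positions are graph nodes and matched templates are directed edges.
    Returns None when no continuous path covers the whole sentence.
    """
    n = len(tokens)
    # best[k] = (number of templates used, their targets) covering tokens[:k]
    best: List[Optional[Tuple[int, List[str]]]] = [None] * (n + 1)
    best[0] = (0, [])
    # Every edge points forward, so visiting edges in order of start position
    # guarantees best[start] is final before it is extended.
    for start, end, target in sorted(candidate_edges(tokens)):
        if best[start] is None:
            continue  # start position is not reachable without a gap
        count, path = best[start]
        if best[end] is None or count + 1 < best[end][0]:
            best[end] = (count + 1, path + [target])
    return None if best[n] is None else best[n][1]
```

For the toy sentence `"the boy reads a book".split()`, the sketch prefers the two longer templates over the four-template alternative, and it returns `None` when some source word is covered by no template.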

Similar resources

Learning Translation Templates From Bilingual Text

This paper proposes a two-phase example-based machine translation methodology which develops translation templates from examples and then translates using template matching. This method improves translation quality and facilitates customization of machine translation systems. This paper focuses on the automatic learning of translation templates. A translation template is a bilingual pair of sen...

Automatic Determination of Number of clusters for creating templates in Example-Based Machine Translation

Example-Based Machine Translation (EBMT), like other corpus based methods, requires substantial parallel training data. One way to reduce data requirements and improve translation quality is to generalize parts of the parallel corpus into translation templates. This automated generalization process requires clustering. In most clustering approaches the optimal number of clusters (N ) is found e...

Template Extraction for a Bidirectional English-Filipino Machine Translation System

A bidirectional English-Filipino Example-based Machine Translation System that learns and uses templates is presented. The system uses machine learning techniques to initially extract templates from a given bilingual corpus. These templates are subsequently used for translating English input text into Filipino and vice versa. The system implements the similarity template learning algorithm perf...

Inducing Translation Templates for Example-Based Machine Translation

This paper describes an example-based machine translation (EBMT) system which relies on various knowledge resources. Morphologic analyses abstract the surface forms of the languages to be translated. A shallow syntactic rule formalism is used to percolate features in derivation trees. Translation examples serve the decomposition of the text to be translated and determine the transfer of lexical...

Automatically Extracting Templates from Examples for NLP Tasks

In this paper, we present the approaches used by our NLP systems to automatically extract templates for example-based machine translation and pun generation. Our translation system is able to extract an average of 73.25% correct translation templates, resulting in a translation quality that has a low word error rate of 18% when the test document contains sentence patterns matching the training ...

Journal title:
  • CoRR

Volume abs/1406.1241  Issue

Pages  -

Publication date 2014